SUMMARY


We present an iteration procedure to locate the minimum
of a continuously differentiable strictly convex function over
the unbounded simplex in Euclidean n-space, and we prove that
the procedure converges to the unique minimum. This procedure
is constructed to facilitate its adaptation to machine programming.
Applications of this procedure to maximum likelihood
estimation in certain non-parametric cases are mentioned.
1. INTRODUCTION


Recently, there have occurred some applications of the
problem of minimizing a convex function, say $f$, defined on
Euclidean n-space $R_n$, over the unbounded simplex

$S = \{(x_1,\dots,x_n) \in R_n : x_1 \le x_2 \le \cdots \le x_n\}.$

Special cases of this problem arise in the maximum likelihood
estimation of parameters subject to known constraints. These
problems have been treated by Brunk et al. in a series of
papers [1], [2], [3]. They also arise in the maximum likelihood
estimation in certain non-parametric situations as treated by
Marshall and Proschan [3].

The method used in the cases cited requires that the
function $f$ be of the form

(1.1)  $f(x_1,\dots,x_n) = \sum_{i=1}^{n} f_i(x_i)$

where each $f_i$ is convex over $R_1$.

Let $u(j-r, j+s)$ be the value of $x$ which minimizes
$\sum_{i=j-r+1}^{j+s-1} f_i(x)$. Then the minimizing point of $f$ over the
simplex $S$, call it $(a_1,\dots,a_n)$, is known to be given by

$a_j = \max_{r \ge 1} \min_{s \ge 1} u(j-r, j+s).$

This straightforward method works for non-parametric estimates
in the case of densities with increasing failure rates.
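
As a minimal sketch of the max-min formula, assume the illustrative special case $f_i(x) = (x - y_i)^2$ (this choice, the data `y`, and the function name are assumptions made here, not taken from the text). For that choice the value $u$ is simply the average of the $y_i$ over the index range, so the formula can be evaluated directly:

```python
def maxmin_solution(y):
    """Evaluate a_j = max over starts <= j of min over ends >= j of u(start, end).

    Assumes f_i(x) = (x - y_i)^2, for which u is the mean of y over the
    inclusive index range start..end (0-based here).
    """
    n = len(y)
    prefix = [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)

    def block_mean(start, end):
        # Mean of y[start..end], inclusive, from prefix sums.
        return (prefix[end + 1] - prefix[start]) / (end - start + 1)

    return [
        max(min(block_mean(s, e) for e in range(j, n)) for s in range(j + 1))
        for j in range(n)
    ]


print(maxmin_solution([3.0, 1.0, 2.0]))
print(maxmin_solution([1.0, 3.0, 2.0]))
```

The direct evaluation costs on the order of $n^3$ averages, which is fine for illustration but not how one would treat large $n$.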

The general problem of minimizing a function over compact subsets
of $R_n$, in the case where the function is strictly convex in each
coordinate, is continuously differentiable, and assumes its minimum in
the interior, has been treated by Warga [6]. He proposes an iteration
procedure that minimizes successively over one coordinate after another,
beginning at any point.

If $f$ is convex and continuously differentiable in $R_n$ and is to be
minimized over the bounded simplex

$S_M = \{(x_1,\dots,x_n) \in R_n : -M \le x_1 \le \cdots \le x_n \le M\}$

for some $M > 0$, then one may utilize the Warga iteration procedure to
find the minimum of $f$ over the n-fold Cartesian product of the interval
$[-M, M]$. Let it be $(a_1^*,\dots,a_n^*)$. Then, knowing the result (proved
in [3]) that if $a_j^* > a_{j+1}^*$, then the point $(a_1,\dots,a_n)$ which
minimizes $f$ over $S_M$ must have $a_j = a_{j+1}$, we can, in at most $n$
applications of the Warga procedure, obtain the minimum over $S_M$.

The problem which arises in the determination of the nonparametric
maximum likelihood estimate in the case the density has
a convex failure rate is that of minimizing a convex function
(which does not have property (1.1)) over the unbounded simplex.
The belief which prompted this effort was that an alternative
could be found to applying the method above for some $M$
sufficiently large.


2. THE ITERATION OPERATOR

Let $f$ be a function which is to be minimized over the
convex set $S$. We assume that $f$ satisfies the following:

1°  For any distinct $\alpha, \beta \in S$ and $t \in (0,1)$,

    $f(t\alpha + (1-t)\beta) < t f(\alpha) + (1-t) f(\beta),$

    and, letting $\delta_j$ be the vector which is zero in every
    coordinate except the $j$-th, which is unity,

    $D_j f(\alpha) = \lim_{t \to 0} \frac{f(\alpha + t\delta_j) - f(\alpha)}{t},$

    the partial derivative with respect to the $j$-th coordinate,
    exists and is continuous for $j = 1,\dots,n$.

2°  If $\{\alpha^i\}$ is a sequence of points in $S$ such that at
    least one coordinate, say the $j$-th, has the property that

    $\limsup_{i \to \infty} |\alpha^i_j| = \infty,$

    then

    $\limsup_{i \to \infty} f(\alpha^i) = \infty.$

Now from 1° it follows that $f$ is a strictly convex continuously
differentiable function and from 2° that a minimizing value of $f$ over
$S$ exists which by the strict convexity must be unique.

For any vector $\alpha \in S$ and any two integers $j, k$ such that
$j \ge 1$, $k \le n$ and $k \ge j$ we make the notational convention,
for $x$ a real number,

(2.1)  $(x : j, k, \alpha) = \alpha + x(\delta_j + \cdots + \delta_k)$

where $\delta_i = (\delta_{1i},\dots,\delta_{ni})$ with $\delta_{ji}$ the Kronecker delta.

Now if we use $\Delta$ as the difference operator $\Delta\alpha_j = \alpha_j - \alpha_{j-1}$,
then $-\Delta\alpha_j \le x \le \Delta\alpha_{k+1}$ implies that $(x : j, k, \alpha) \in S$. Let the value
of $x \in [-\Delta\alpha_j, \Delta\alpha_{k+1}]$ which minimizes $f(x : j, k, \alpha)$ be denoted by
$\rho(j, k, \alpha)$.
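
The notation (2.1) and the admissible interval for $x$ can be sketched in a few lines. The function names are hypothetical, indices $j, k$ are 1-based as in the text, and the conventions $\alpha_0 = -\infty$, $\alpha_{n+1} = +\infty$ are an assumption made here so that the interval is unbounded on a side where there is no neighbouring coordinate:

```python
import math


def feasible_interval(alpha, j, k):
    """Return (-Delta alpha_j, Delta alpha_{k+1}) for 1-based 1 <= j <= k <= n.

    Assumed convention: alpha_0 = -inf and alpha_{n+1} = +inf.
    """
    n = len(alpha)
    lo = -math.inf if j == 1 else -(alpha[j - 1] - alpha[j - 2])
    hi = math.inf if k == n else alpha[k] - alpha[k - 1]
    return lo, hi


def shift(alpha, j, k, x):
    """Return (x : j, k, alpha): add x to coordinates j..k (1-based, inclusive)."""
    return [a + x if j <= i <= k else a for i, a in enumerate(alpha, start=1)]


def in_simplex(alpha):
    """Check the ordering alpha_1 <= alpha_2 <= ... <= alpha_n."""
    return all(a <= b for a, b in zip(alpha, alpha[1:]))


alpha = [0.0, 1.0, 1.0, 4.0]
lo, hi = feasible_interval(alpha, 2, 3)
print(lo, hi)
print(in_simplex(shift(alpha, 2, 3, hi)), in_simplex(shift(alpha, 2, 3, hi + 0.5)))
```

Any $x$ inside the returned interval keeps the shifted point ordered; stepping past either end breaks the ordering with the neighbouring coordinate.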


LEMMA 1: For given integers $j, k$ such that $j \ge 1$, $k \le n$,
$k \ge j$, $\rho(j, k, \alpha)$ is a continuous function of $\alpha$ on $S$.


PROOF: Let $j, k$ as required and $\alpha \in S$ be fixed arbitrarily.
To show continuity, let $\{\alpha^n\}$ be a sequence of points in $S$ such
that $\alpha^n \to \alpha \in S$. If $\rho_n = \rho(j, k, \alpha^n)$ does not converge to
$\rho(j, k, \alpha)$, then by compactness there is a subsequence such that

$\rho_{n_i} \to \rho_0 \ne \rho(j, k, \alpha).$


CASE I: $-\Delta\alpha_j < \Delta\alpha_{k+1}$. Take $x \in (-\Delta\alpha_j, \Delta\alpha_{k+1})$. Then for that
$x$ there is an $N$ sufficiently large that

$-\Delta\alpha_j^{n_i} < x < \Delta\alpha_{k+1}^{n_i}$  whenever $i > N$,

and then for $i > N$ we have

$f(\rho_{n_i} : j, k, \alpha^{n_i}) \le f(x : j, k, \alpha^{n_i}).$

Now letting $i \to \infty$, we have by continuity of $f$ that

$f(\rho_0 : j, k, \alpha) \le f(x : j, k, \alpha).$

This inequality holds for arbitrary $x \in (-\Delta\alpha_j, \Delta\alpha_{k+1})$ and by
continuity of $f$ must hold for all $x$ in the closed interval.
Hence, by strict convexity, it must follow that $\rho_0$ is the
minimizing value and by definition $\rho_0 = \rho(j, k, \alpha)$, which is a
contradiction.


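CASE II: $-\Delta\alpha_j = \Delta\alpha_{k+1}$. Since $\Delta\alpha_i \ge 0$, we must have
$\Delta\alpha_j = \Delta\alpha_{k+1} = 0$, and since $\alpha^n \to \alpha$ it must follow that
$|\Delta\alpha_j^n| + |\Delta\alpha_{k+1}^n| \to 0$ as $n \to \infty$. But $-\Delta\alpha_j^n \le \rho_n \le \Delta\alpha_{k+1}^n$ and
hence $\rho_n \to 0 = \rho(j, k, \alpha)$. This completes the proof.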

We now define the transformations

(2.2)  $A_{jk}(\alpha) = \alpha + \rho(j, k, \alpha)(\delta_j + \cdots + \delta_k)$  for $1 \le j \le k \le n$.

Following immediately from Lemma 1 we make the obvious

REMARK: For each $j, k$ as prescribed, the transformation $A_{jk}$ is
a continuous map from $S$ into $S$.


In turn, we set

(2.3)  $A_1 = A_{nn} \cdots A_{jj} \cdots A_{22} A_{11},$
       $A_r = A_{n+1-r,n} \cdots A_{2,r+1} A_{1r},$
       $A_n = A_{1n},$

where juxtaposition indicates composition of the transformations.
Finally, we set

(2.4)  $B = A_n \cdots A_2 A_1.$

Since the composition of continuous functions is continuous,
we have

THEOREM 1: The transformation $B$ is a continuous map of
$S$ into itself.
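
To make the order of composition concrete, the following sketch lists the index pairs $(j, k)$ in the order in which the maps $A_{jk}$ are applied within one application of $B$. It assumes the reading of (2.3) and (2.4) in which $A_r$ applies $A_{1r}$ first and $A_{n+1-r,n}$ last, and $B$ applies $A_1$ first and $A_n$ last (the rightmost factor of a composition acts first). The function name is hypothetical.

```python
def sweep_schedule(n):
    """List the 1-based pairs (j, k) in the order A_jk is applied within B.

    Assumed reading: blocks of length r = 1, 2, ..., n in turn, and for
    each r the block slides from (1, r) up to (n + 1 - r, n).
    """
    pairs = []
    for r in range(1, n + 1):          # A_1 is applied first, A_n last
        for j in range(1, n + 2 - r):  # A_{1,r} first, A_{n+1-r,n} last
            pairs.append((j, j + r - 1))
    return pairs


print(sweep_schedule(3))
```

Under this reading one application of $B$ performs $n(n+1)/2$ one-dimensional minimizations, one for each block of consecutive coordinates.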


We now prove

LEMMA 2: If $A_{jk}(\alpha) \ne \alpha$, then $f(A_{jk}(\alpha)) < f(\alpha)$.

PROOF: By definition

$f(A_{jk}(\alpha)) \le f(\alpha + x\delta_j + \cdots + x\delta_k)$  for all $x \in [-\Delta\alpha_j, \Delta\alpha_{k+1}]$,

and by strict convexity we obtain equality iff $x = \rho(j, k, \alpha)$.
Since always $0 \in [-\Delta\alpha_j, \Delta\alpha_{k+1}]$ for $\alpha \in S$, we have $f(A_{jk}(\alpha)) \le f(\alpha)$
with equality iff $\rho(j, k, \alpha) = 0$. Clearly then $\rho(j, k, \alpha) \ne 0$
implies $f(A_{jk}(\alpha)) < f(\alpha)$, and by equation (2.2) it follows that
$A_{jk}(\alpha) \ne \alpha$ iff $\rho(j, k, \alpha) \ne 0$. This completes the proof.

3. PROOF OF CONVERGENCE

Definition: A point $\alpha \in S$ is a fixed point of $B$ iff
$B(\alpha) = \alpha$.

One checks easily that the property of the $A_{jk}$'s expressed in
Lemma 2 is preserved under composition. Thus, we have

THEOREM 2: $B$ is a transformation defined from $S$ into $S$
such that if $\alpha$ is not fixed, then $f(B\alpha) < f(\alpha)$.

There follows immediately

THEOREM 3: If $\mu$ is the unique minimum of $f$ over $S$,
then $\mu$ is a fixed point of $B$.

PROOF: Otherwise, by Theorem 2, $f(B(\mu)) < f(\mu)$, which would
contradict the fact that $\mu$ was the minimum.

We now derive some properties of this minimum which shall be
needed subsequently. We define

(3.1)  $\partial f(\alpha : \beta - \alpha) = \lim_{t \downarrow 0} \frac{1}{t}\,[f(\alpha + t\beta - t\alpha) - f(\alpha)]$

and as we know we have

(3.2)  $\partial f(\alpha : \beta - \alpha) = \sum_{i=1}^{n} (\beta_i - \alpha_i) D_i f(\alpha).$

In the above, we have followed the notation of [4].
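
As a small numerical illustration of (3.1) and (3.2), the sketch below compares a one-sided difference quotient with the gradient formula. The test function $f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2$, the step size `t`, and the function names are assumptions chosen only for this example.

```python
def directional_derivative(f, alpha, beta, t=1e-6):
    """One-sided difference quotient approximating (3.1) for small t > 0."""
    point = [a + t * (b - a) for a, b in zip(alpha, beta)]
    return (f(point) - f(alpha)) / t


def gradient_formula(grad, alpha, beta):
    """Right-hand side of (3.2): sum of (beta_i - alpha_i) * D_i f(alpha)."""
    return sum((b - a) * g for a, b, g in zip(alpha, beta, grad(alpha)))


def f(x):
    # Illustrative strictly convex quadratic (an assumption of this sketch).
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2


def grad_f(x):
    return [2 * x[0] + x[1], x[0] + 2 * x[1]]


alpha, beta = [1.0, 2.0], [3.0, 5.0]
print(directional_derivative(f, alpha, beta), gradient_formula(grad_f, alpha, beta))
```

The two numbers agree up to a term of order `t`, which is what (3.2) asserts in the limit.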
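LEMMA 3: Suppose the minimum $\mu$ of $f$ is such that

$\mu_{j-1} < \mu_j = \cdots = \mu_k < \mu_{k+1}$  for $j \le k$;

then it follows that for $j \le r \le k$

(3.4.1)  $\partial f(\mu : \sum_{i=r}^{k} \delta_i) = \sum_{i=r}^{k} D_i f(\mu) \ge 0,$

(3.4.2)  $\partial f(\mu : \sum_{i=j}^{r} \delta_i) = \sum_{i=j}^{r} D_i f(\mu) \le 0,$

(3.4.3)  $\partial f(\mu : \sum_{i=j}^{k} \delta_i) = \sum_{i=j}^{k} D_i f(\mu) = 0.$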


PROOF: If $\gamma$ is any point in $S$, then

(3.5)  $\partial f(\mu : \gamma - \mu) = \sum_{i=1}^{n} (\gamma_i - \mu_i) D_i f(\mu) \ge 0,$

otherwise $\mu$ would not be the minimum, since there would exist a
$t > 0$ such that

$\frac{1}{t}\,[f(\mu + t(\gamma - \mu)) - f(\mu)] < 0,$

which would be a contradiction to $\mu$ being the minimum. Now take

$\gamma_i = \mu_i - x$  for $i = j,\dots,r$,
$\gamma_i = \mu_i$      otherwise.

For some $x > 0$ such that $\gamma \in S$ it follows that

$\sum_{i=1}^{n} (\gamma_i - \mu_i) D_i f(\mu) = -x\, \partial f(\mu : \sum_{i=j}^{r} \delta_i) \ge 0.$

This proves (3.4.2). The other case is done similarly. The
proof of (3.4.3) follows immediately as a corollary since we
must have the l.h.s. both $\ge 0$ and $\le 0$.
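
The sign conditions of Lemma 3 can be checked numerically on a small example. The sketch assumes the illustrative objective $f(x) = \sum_i (x_i - y_i)^2$ with $y = (3, 1, 2)$, whose minimum over $S$ is $\mu = (2, 2, 2)$, so that a single block $j = 1$, $k = 3$ carries the constant value. The function name and tolerance are hypothetical.

```python
def lemma3_checks(grad_at_mu, j, k, tol=1e-9):
    """Check (3.4.1)-(3.4.3) on the block j..k (1-based, inclusive).

    grad_at_mu holds D_1 f(mu), ..., D_n f(mu).
    Returns (tail sums >= 0, head sums <= 0, whole-block sum == 0).
    """
    g = grad_at_mu
    tails = [sum(g[r - 1:k]) for r in range(j, k + 1)]   # sums over i = r..k
    heads = [sum(g[j - 1:r]) for r in range(j, k + 1)]   # sums over i = j..r
    total = sum(g[j - 1:k])                              # sum over i = j..k
    return (
        all(t >= -tol for t in tails),
        all(h <= tol for h in heads),
        abs(total) <= tol,
    )


y = [3.0, 1.0, 2.0]
mu = [2.0, 2.0, 2.0]
grad_at_mu = [2 * (m - v) for m, v in zip(mu, y)]  # D_i f(mu) = 2 (mu_i - y_i)
print(grad_at_mu, lemma3_checks(grad_at_mu, 1, 3))
```

For a point that is not the minimum, for instance $(0, 0, 0)$ with the same $y$, the whole-block sum is nonzero and the third check fails.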

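We can now state the crucial

THEOREM 4: If $\alpha$ is not the minimum of $f$ over $S$, then $\alpha$
is not a fixed point of $B$.

PROOF: Suppose, to the contrary, that $\alpha$ is a fixed point of $B$
which is not the minimum. By (3.3) we have

(3.6)  $\partial f(\alpha : \mu - \alpha) = \sum_{j=1}^{n} (\mu_j - \alpha_j) D_j f(\alpha) < 0.$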

If we consider the intervals of indices across which $\alpha$ is
constant, say,

$1 = \ell_0 < \ell_1 < \ell_2 < \cdots < \ell_N < \ell_{N+1} = n + 1,$

we may rewrite (3.6) as

(3.7)  $\partial f(\alpha : \mu - \alpha) = \sum_{i=0}^{N} \sum_{j=\ell_i}^{-1+\ell_{i+1}} (\mu_j - \alpha_j) D_j f(\alpha) < 0.$

Since $\alpha$ is a fixed point of $B$ we must have, over each interval
in which $\alpha$ is constant,

(3.7.1)  $\partial f(\alpha : \delta_{\ell_j} + \cdots + \delta_{-1+\ell_{j+1}}) = 0$,  $j = 0,\dots,N$,

otherwise $\alpha$ would not be a fixed point of $A_{\ell_j,\,-1+\ell_{j+1}}$.
Moreover, for every $j = 0,\dots,N$, we must have for each
$\ell_j \le k < \ell_{j+1}$,

(3.7.2)  $\partial f(\alpha : \delta_{\ell_j} + \cdots + \delta_k) \le 0,$

(3.7.3)  $\partial f(\alpha : \delta_k + \cdots + \delta_{-1+\ell_{j+1}}) \ge 0,$

otherwise, $\alpha$ would not be fixed for the appropriate $A_{j,k}$'s.

Since the sum in (3.7) is negative, it follows that at least
one of the summands must be negative. Suppose without loss of
generality that it is the first, and write $\alpha_1$ for the common
value of $\alpha_j$ on $\ell_0 \le j < \ell_1$. If we also consider the
subintervals of $\ell_0 \le j < \ell_1$ across which $\mu$ is constant,
say for some $m \ge 1$,

$\ell_0 = k_1 < k_2 < \cdots < k_m < k_{m+1} = \ell_1,$

we have from (3.7) by using (3.2) that

(3.7.4)  $\sum_{i=1}^{m} (\mu_{k_i} - \alpha_1)\, \partial f(\alpha : \delta_{k_i} + \cdots + \delta_{-1+k_{i+1}}) < 0.$

If $\mu$ is constant across $\ell_0 \le j < \ell_1$ (i.e., $m = 1$ in
(3.7.4)), then we have a contradiction with (3.7.1). Thus, we
assume $m > 1$. Because $\mu_{k_i}$ is an increasing function of $i$
it must cross $\alpha_1$ at most once. Thus, we can write (3.7.4)
as

(3.7.5)  $0 > \left( \sum_{i=1}^{s} + \sum_{i=s+1}^{m} \right) (\mu_{k_i} - \alpha_1)\, \partial f(\alpha : \delta_{k_i} + \cdots + \delta_{-1+k_{i+1}}).$

On the right-hand side of (3.7.5) we label the first sum I and
the second II, where

(3.7.6)  $\mu_{k_i} \le \alpha_1$  for $i = 1,\dots,s$,
         $\mu_{k_i} > \alpha_1$  for $i = s+1,\dots,m$,

and perhaps one of the summations is vacuous.

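Consider the first term of I, keeping (3.7.2) and (3.7.6) in
mind. Now

$(\mu_{k_1} - \alpha_1)\, \partial f(\alpha : \delta_{k_1} + \cdots + \delta_{-1+k_2}) \ge 0.$

Since $\mu_{k_i}$ is an increasing function of $i$, we have also

$0 \le (\mu_{k_2} - \alpha_1)\, \partial f(\alpha : \delta_{k_1} + \cdots + \delta_{-1+k_2}) \le (\mu_{k_1} - \alpha_1)\, \partial f(\alpha : \delta_{k_1} + \cdots + \delta_{-1+k_2}).$

By combining the terms $i = 1, 2$, we have

$\mathrm{I} \ge \sum_{i=3}^{s} (\mu_{k_i} - \alpha_1)\, \partial f(\alpha : \delta_{k_i} + \cdots + \delta_{-1+k_{i+1}}) + (\mu_{k_2} - \alpha_1)\, \partial f(\alpha : \delta_{k_1} + \cdots + \delta_{-1+k_3}).$

Repeating the argument $s$ times we see that eventually
we show

$\mathrm{I} \ge (\mu_{k_s} - \alpha_1)\, \partial f(\alpha : \delta_{k_1} + \cdots + \delta_{-1+k_{s+1}}) \ge 0.$

A similar argument holds for showing $\mathrm{II} \ge 0$, and thus we
have a contradiction to (3.7.5).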

We now have

THEOREM 5: For any $\alpha \in S$ the sequence $\{B^n(\alpha)\}$
converges to a fixed point of $B$.

PROOF: Let $\alpha^n = B^n(\alpha)$ for $n = 1, 2, \dots$; then the
sequence $\{\alpha^n\}$ has a convergent subsequence since it is from
the set $\{\beta \in S : f(\beta) \le f(\alpha)\}$, which is closed and bounded by
virtue of 2° and, being a subset of Euclidean n-space, is compact.
Set $b_n = f(\alpha^n)$. The sequence $\{b_n\}$ is a decreasing sequence
of real numbers bounded below and therefore converges to $b_0$,
say. Clearly all limit points of $\{\alpha^n\}$ have the same $f$ value,
namely $b_0$.

Let $\alpha^{n_k} \to \gamma$, say; then

$b_0 = f(\gamma) \le fB(\alpha^{n_k}) \le f(\alpha^{n_k}),$

and letting $k \to \infty$ we have

$\lim_{k \to \infty} fB(\alpha^{n_k}) = f(\gamma).$

But by Theorem 1, $B$ is continuous, and by assumption $f$ is
also; thus we have

$\lim_{k \to \infty} fB(\alpha^{n_k}) = fB(\gamma),$

and $fB(\gamma) = f(\gamma)$. By Theorem 2, $\gamma$ is therefore a fixed point,
which by Theorem 4 is the unique minimum. Since every convergent
subsequence has this same limit and the sequence lies in a compact
set, the whole sequence converges to it.
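
The whole procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the exact one-dimensional minimization defining $\rho(j, k, \alpha)$ is replaced by a golden-section search, an unbounded side of the search interval is replaced by a hypothetical finite bound `big`, and the maps $A_{jk}$ are applied by block length $r = 1, \dots, n$ with each block sliding from left to right, which is one reading of (2.3) and (2.4). All function names are hypothetical.

```python
import math


def golden_min(phi, lo, hi, iters=200):
    """Approximate the minimizer of a unimodal phi on [lo, hi]."""
    inv = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - inv * (b - a), a + inv * (b - a)
    fc, fd = phi(c), phi(d)
    for _ in range(iters):
        if fc <= fd:                     # minimizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv * (b - a)
            fc = phi(c)
        else:                            # minimizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv * (b - a)
            fd = phi(d)
    return (a + b) / 2


def apply_B(f, alpha, big=1e3):
    """One application of B: apply A_jk for every block j..k in sweep order."""
    alpha = list(alpha)
    n = len(alpha)
    for r in range(1, n + 1):
        for j in range(1, n + 2 - r):
            k = j + r - 1
            # Interval [-Delta alpha_j, Delta alpha_{k+1}]; `big` stands in
            # for an unbounded side (an assumption of this sketch).
            lo = -big if j == 1 else -max(0.0, alpha[j - 1] - alpha[j - 2])
            hi = big if k == n else max(0.0, alpha[k] - alpha[k - 1])

            def phi(x):
                return f([a + x if j <= i <= k else a
                          for i, a in enumerate(alpha, start=1)])

            rho = golden_min(phi, lo, hi)
            alpha = [a + rho if j <= i <= k else a
                     for i, a in enumerate(alpha, start=1)]
    return alpha


def iterate_B(f, alpha, sweeps=20):
    """Return B^sweeps(alpha)."""
    for _ in range(sweeps):
        alpha = apply_B(f, alpha)
    return alpha


y = [3.0, 1.0, 2.0]
objective = lambda x: sum((xi - yi) ** 2 for xi, yi in zip(x, y))
print(iterate_B(objective, [0.0, 0.0, 0.0]))
```

For this separable example the iterates settle near $(2, 2, 2)$, the ordered minimizer. The same functions accept a non-separable strictly convex objective; in that case the value of `big` must be large enough to contain each one-dimensional minimizer.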